Understanding how people with Parkinson's disease turn in gait from a real-world in-home dataset

Introduction: Digital parameters of turning in gait may be useful in measuring disease progression in Parkinson's disease (PD); however, challenges remain over algorithm validation in real-world settings. The influence of clinician observation on turning outcomes is poorly understood. Our objective is to describe a unique in-home video dataset and explore the use of turning parameters as biomarkers in PD.
Methods: Eleven participants with PD and 11 control participants stayed in a home-like setting, living freely for 5 days (with two sessions of clinical assessment), during which high-resolution video was captured. Clinicians watched the videos, identified turns and documented turning parameters.
Results: From 85 hours of video, 3869 turns were evaluated, averaging 22.7 turns per hour per person. Six participants had significantly different numbers of turning steps and/or turn duration between “ON” and “OFF” medication states. Positive Spearman correlations were seen between the Movement Disorders Society-sponsored revision of the Unified Parkinson's Disease Rating Scale III score and a) number of turning steps (rho = 0.893, p < 0.001), and b) duration of turn (rho = 0.744, p = 0.009) “OFF” medications. A positive correlation was seen “ON” medications between number of turning steps and clinical rating scale score (rho = 0.618, p = 0.048). Both cohorts took more steps and had shorter turn durations during observed clinical assessments than when free-living.
Conclusion: This study shows proof of concept that real-world free-living turn duration and number of turning steps can distinguish between PD medication states and correlate with gold-standard clinical rating scale scores. It illustrates a methodology for ecological validation of real-world digital outcomes.

measuring aspects of turns potentially of specific use in clinical trials of disease-modifying interventions which typically recruit recently diagnosed patients [10].
The current gold-standard clinical rating scale used in clinical trials to measure mobility outcomes, the Movement Disorders Society-sponsored revision of the Unified Parkinson's Disease Rating Scale motor sub-score (MDS-UPDRS III) [11], has limitations. These include its 'snapshot' nature, which cannot capture the real-world functional turning performance of patients [12]; its non-linear and discontinuous scoring system; inter-rater variability [13]; and the Hawthorne effect [14] of being observed on how someone mobilizes [15,16].
To overcome these limitations, much work has been put into developing technological sensors, particularly wearable devices, which can detect [17] and quantify [18][19][20][21][22] aspects of how someone turns in gait, with some of this work done in home settings. However, wearable algorithms evaluating gait have been shown to translate poorly from lab to home [9], and we know that people mobilise differently between laboratory and home settings [20,21], so the challenge of 'ecological validation' (whether the findings of the research can be applied to real-world situations) remains: without a clinician present, or video cameras recording in-home, the "ground truth" of what is actually happening when the sensor predicts a turn is missing. Attempts at ground truth have been made using clinician-held video cameras in laboratory [23] and home [24] settings, but this does not overcome the problem that gait performance in PD can change purely due to being observed by a clinician/researcher [16]. Work has been done showing the acceptability of using high-resolution video recording for validation purposes in home settings in PD [25,26].
Other unmet needs in this space include: identifying which of the many gait parameters are most useful as surrogate markers of overall mobility in PD, and for which patient sub-groups within PD [21]; understanding what the parameters tell us (e.g. rates of progression and medication status); and more fully understanding the impact of clinician observation and/or setting on gait in PD. Additionally, there has been little work looking at turning in the practically defined "OFF" versus "ON" medication state in PD, which is particularly relevant to neuroprotective/neurorestorative trials.
The novelty of this work is, firstly, in the collection of unobtrusive video data providing ground truth from a home-like setting, where most of the participant data was captured with no researcher present. Secondly, this dataset has been watched and annotated (where turns are detected and aspects quantified) post-hoc by clinician raters manually, providing a rich resource with which to understand real-world turning behavior. Thirdly, this data affords the comparison between human-observed (structured) and unobserved (unstructured) turning, from the same home-like setting. Controlling for the setting could give specific insight into the impact of clinician observations on gait.
We hypothesized that real-world in-home turning parameters would show promise in discriminating between PD and control participants and between the "ON" and "OFF" medication state in PD. Furthermore, we hypothesized that turning parameters would have face validity in free-living, shown by correlations with the current gold-standard clinical assessments. A sub-objective was to explore the difference in how someone turns during a clinical assessment compared to living freely. The long-term goal is to advance digital measures of mobility which could serve as critical markers of disease progression for use in clinical trials; and secondly to show a method to ecologically validate such mobility biomarkers.

Participants
22 participants were recruited, 11 of whom had a clinical diagnosis of PD according to UK Brain Bank Criteria, with a Modified Hoehn and Yahr Scale score of 3 or less in the "OFF" medications state. This was a convenience sample size chosen to explore which parameters showed proof of concept and feasibility as potential markers of disease change in Parkinson's from in-home real-world data. The control cohort, included to show comparison to PD and to increase the safety and enjoyment of all in the study, consisted of spouses, close family members and friends of the PD participants; the participants took part in pairs (one person with PD, one control participant). Informed consent was gained from all participants. Full approvals (NHS Research Ethics Committee and Health Research Authority) were confirmed on 14th January 2020. The CONSORT diagram of recruitment is in the Multimedia Appendix.

Data collection
Each pair of participants stayed in an instrumented house for 5 days/4 nights continuously. Wall-mounted video cameras in the kitchen, hall, dining room and living room recorded video for around 2 h a day at varying times according to participant preference, with a resolution of 640 × 480 pixels and a frame rate of 30 frames/second (see study protocol [27]). During their time in the house, each pair lived freely apart from when researchers visited (twice) to conduct clinical assessments. On one testing occasion, the PD participants undertook assessments twice: first in the practically defined "OFF" medications state then in the "ON" state having taken sufficient medication to achieve this.
The clinical assessments conducted by the clinician researcher included:
• MDS-UPDRS motor sub-scale
• A 20-m gait evaluation (in a 5-m corridor where the person turns 180° 3 times) performed at least 3 times each in the "ON" and "OFF" medications state for the participant with PD.
PD participants withheld their long-acting dopaminergic medication for 24 h and short-acting agents for 12 h before the practically-defined "OFF" data capture, including clinical assessments, on day 4. Participants with deep brain stimulators switched stimulation off 1 h prior to the "OFF" medication period clinical evaluation.
The "ON" medications video data was captured over around 6 h per participant pair split across 3 days, whereas the "OFF" medications data was from around 2 h of video on a single day (day 4). The timings were chosen for pragmatic reasons because of the participant discomfort burden of being "OFF" medications limiting such data captured to 1 day only, and the ethical approval to record 2 h of video each day during the study. The numbers of turns in each medication state are shown in Table 3.

Annotations
The videos were watched post-hoc by medical doctors who had undertaken training in the MDS-UPDRS rating score and had an interest in movement disorders; various aspects of the participants' movement were identified and quantified, producing "annotations" (a set of parameter outcomes for each episode of turning of gait) [28]. Widely available software called ELAN was used to watch up to 4 simultaneously captured video files at a time. A pre-prepared annotation template was used by both clinician raters, with controlled vocabularies in drop-down menus to reduce the variability in the annotations created. The parameters annotated were: turning angle estimation (90°-360° in 45° increments), number of turning steps (integers from 1 to 18), duration of turn (seconds:milliseconds), type of turn (pivot turn, step turn).
A turning episode was defined as:
• Starting from the initiation of rotation of the pelvis, ending in completion of movement
• Not a turn taken in a walking arc (e.g. walking around a table)
• Clearly visible from the video
Where the participants' feet were visible in the video frame, the number of turning steps was counted. In terms of type of turn: a pivot turn was classified as a turn in which one or both feet swivel in place to achieve the turning movement; a step turn was classified as a turn achieved by three steps or more without pivoting.
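As an illustration of how the controlled vocabularies described above constrain each annotation, the following minimal Python sketch represents one turning episode as a validated record. The class name, field names and example values are hypothetical and are not taken from the study's ELAN template; only the allowed ranges (angles of 90° to 360° in 45° increments, 1 to 18 turning steps, pivot or step turn) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Controlled vocabularies taken from the annotation description above.
ALLOWED_ANGLES_DEG = tuple(range(90, 361, 45))  # 90, 135, ..., 360
ALLOWED_TURN_TYPES = ("pivot", "step")


@dataclass(frozen=True)
class TurnAnnotation:
    """One annotated turning episode. All field names are hypothetical."""

    participant_id: str
    angle_deg: int
    duration_s: float
    turn_type: str
    n_steps: Optional[int] = None  # None when the feet were not visible

    def __post_init__(self):
        # Reject values that fall outside the controlled vocabularies.
        if self.angle_deg not in ALLOWED_ANGLES_DEG:
            raise ValueError(f"angle must be one of {ALLOWED_ANGLES_DEG}")
        if self.turn_type not in ALLOWED_TURN_TYPES:
            raise ValueError(f"turn type must be one of {ALLOWED_TURN_TYPES}")
        if self.duration_s <= 0:
            raise ValueError("duration must be positive")
        if self.n_steps is not None and not 1 <= self.n_steps <= 18:
            raise ValueError("number of turning steps must be between 1 and 18")


# Illustrative values only, not study data.
example = TurnAnnotation("PD 1", angle_deg=180, duration_s=2.4,
                         turn_type="step", n_steps=4)
```

Validating at construction time mirrors the purpose of the drop-down menus: an out-of-vocabulary value is rejected when the annotation is created rather than discovered during analysis.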

Inter-rater agreement
Two clinicians annotated 50% of the turns each. Around 50% of the total number of annotations were cross-checked (randomly selecting 6 pairs from 11) by both clinician annotators, blinding the cross-checking clinician to the turning annotations produced by the other. Cohen's Kappa [29] statistic was calculated to evaluate inter-rater reliability. Any discrepancies were recorded, discussed, and resolved by the clinician raters, and with a final review by a movement disorders specialist.
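For readers unfamiliar with the statistic, Cohen's kappa compares the observed proportion of agreement between two raters with the agreement expected by chance from each rater's label frequencies. The sketch below is a minimal pure-Python illustration; the function name and the example ratings are hypothetical, and the software actually used for the analysis is not stated in this section.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label proportions.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical turning-angle labels (degrees) from two raters, not study data.
angles_rater_1 = [90, 90, 180, 180, 90, 360, 180, 90]
angles_rater_2 = [90, 90, 180, 135, 90, 360, 180, 90]
kappa = cohens_kappa(angles_rater_1, angles_rater_2)
```

In this illustrative example the raters agree on 7 of 8 turns, and kappa is lower than the raw agreement of 0.875 because part of that agreement would be expected by chance.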

Statistical approach
To investigate the correlations between the MDS-UPDRS III total score and the turning parameters, Spearman's rank correlation coefficients were used. To evaluate mean group differences (between PD and control; PD "ON" and "OFF" medication; free-living and clinical assessment turns), Wilcoxon rank-sum [30] tests were utilised. Because most turns taken during the clinical assessments were over a 180° angle (as shown in Table 1), only the 180° turns from both sets were used for statistical analysis comparing clinical assessment with free-living turns.
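The correlation analysis described above, together with the Benjamini-Hochberg false discovery rate adjustment reported with the correlation results, can be sketched in a few lines of pure Python. This is an illustrative sketch under stated assumptions rather than the analysis code used in the study: the function names are hypothetical, the example values are invented, and only the correlation coefficient (not its p-value) is computed.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-participant values, not study data.
updrs_iii = [12, 25, 31, 18, 40, 22]
mean_turning_steps = [2.1, 3.0, 3.4, 2.6, 4.2, 2.9]
rho = spearman_rho(updrs_iii, mean_turning_steps)
adjusted_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A rank-based coefficient is a natural fit here because the MDS-UPDRS III score is ordinal and, as noted in the introduction, its scoring is non-linear.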

Results
The demographics and clinical characteristics for the 22 participants are shown in Table 1. The mean ages were 61.9 (PD) and 59.2 (controls) years.

Numbers of turns captured
In total, 85.0 h of video footage was captured and annotated. Table 2 shows that from this data 3869 turns were observed in total and that over half of these were turns over a 90° angle. There were 22.7 turns per hour per person on average.

Inter-rater agreement
The two clinician raters had an almost perfect [31] inter-rater agreement for turning angle (Cohen's kappa = 0.96) and number of turning steps (Cohen's kappa = 0.97) annotations.
Table 3 shows that 6 out of 11 participants with PD had an increase in mean number of turning steps when "OFF" medications compared to when "ON" medications, and 5 out of 11 showed an increase in mean turn duration in the same comparison (looking at all turning angles, from free-living data only).

"ON" and "OFF" medication state in the participant with PD
One participant (PD 8) showed fewer turning steps and a shorter mean turning duration when "OFF" medications compared to "ON" medications, and PD 9 also had a shorter mean duration of turn when "OFF" medications compared to "ON".
Two PD participants (PD 2 and PD 5, the 2 subjects with the longest duration of disease) took significantly more step turns when they were in the "OFF" medication state compared to when they were "ON" medications. For PD 2, step turns as a percentage of total turns were 19% "ON" and 93% "OFF" (p < 0.001); for PD 5, they were 31% "ON" and 65% "OFF" (p < 0.001).

PD vs. control
The PD cohort (n = 11) had a significantly higher average number of turning steps (mean = 2.95, 95% confidence interval [2.87, 3.03], SD =
Table 4 shows that there is a strong correlation between the mean number of turning steps and mean turn duration in the PD cohort when the participants are free-living "ON" medications (rho = 0.897). It also shows a significant positive correlation between the MDS-UPDRS motor sub-score (III) and the mean number of turning steps in free-living when the participant is "ON" (rho = 0.618). There is a significant correlation between mean number of turning steps and turn duration in the PD participants while "OFF" medication in free-living (rho = 0.642). There is a strong positive correlation between the MDS-UPDRS III motor sub-score and both the mean number of turning steps (rho = 0.893) and mean turn duration (rho = 0.744) "OFF" medications. False discovery rate adjustment was performed for dependent tests using the Benjamini-Hochberg procedure [32]. Where p-values have been corrected, they are reported alongside the number of dependent tests. The positive correlations are visualized in scatter plots in Fig. 1.

Table 4
Table showing correlations between "ON" and "OFF" medication turning parameters from the PD participant cohort in free-living and clinical rating outcomes (all turning angles). *p < 0.05, **p < 0.01, ***p < 0.001. a Survived false discovery rate adjustment using Benjamini-Hochberg (B-H) procedure.
Clin Ax = Clinical Assessments. ID = identification; DD = PD disease duration; N = number of turning episodes; *p < 0.05, **p < 0.01, ***p < 0.001.

Clinical assessments vs. free-living
Table 5 shows the comparison between free-living and clinical assessment turning over 180°. The PD participant cohort took significantly more steps to turn during clinical assessments (mean = 4.75 [4.47, 5.04]) compared to free-living (mean = 3.58 [3.36, 3.81], n = 11, W = 27304, p < 0.001). The increase in mean number of turning steps was also seen with statistical significance in six individual participants with PD; six PD participants also had a shorter mean turn duration in clinical assessments compared to free-living.
The control participant cohort also took more turning steps on average to make a 180° turn when undertaking clinical assessment (mean = 4.34 [3.95, 4.72]) compared to free-living (mean = 2.72 [2.59, 2.85], W = 21228, p < 0.001). Five control individuals showed a significantly increased mean number of turning steps, while one (C3) conversely took fewer steps during clinical assessments. The control cohort overall also had shorter mean turn durations when they were being observed during clinical assessment compared to free-living (p < 0.001), and 8 control participants supported this trend at an individual level.
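To make the group comparisons above more concrete, the following minimal Python sketch computes a Wilcoxon rank-sum (Mann-Whitney) statistic with a two-sided p-value from the normal approximation with tie correction. It is illustrative only. The function name and example step counts are hypothetical, no continuity correction is applied, and the convention used for W here (rank sum of the first sample minus its minimum possible value) is an assumption, since the software and convention behind the reported W values are not stated in this section.

```python
from collections import Counter
from math import erfc, sqrt


def rank_sum_test(sample_a, sample_b):
    """Return (W, two-sided p) using the tie-corrected normal approximation."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = list(sample_a) + list(sample_b)
    counts = Counter(pooled)

    # Assign each distinct value the mean rank of its tied group.
    rank_of = {}
    start = 0
    for value in sorted(counts):
        c = counts[value]
        rank_of[value] = start + (c + 1) / 2  # mean of ranks start+1 .. start+c
        start += c

    rank_sum_a = sum(rank_of[v] for v in sample_a)
    w = rank_sum_a - n_a * (n_a + 1) / 2

    n = n_a + n_b
    tie_term = sum(c ** 3 - c for c in counts.values())
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return w, 1.0  # all observations identical: no evidence of a difference
    z = (w - n_a * n_b / 2) / sqrt(variance)
    p_two_sided = erfc(abs(z) / sqrt(2))
    return w, p_two_sided


# Hypothetical numbers of turning steps for 180-degree turns, not study data.
clinical_assessment_steps = [5, 4, 6, 5, 4, 5, 6, 4]
free_living_steps = [3, 4, 3, 2, 4, 3, 3, 5]
w_stat, p_value = rank_sum_test(clinical_assessment_steps, free_living_steps)
```

Step counts are small integers with many ties, so the tie correction to the variance matters in practice; an exact test could be preferred for very small samples.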

Discussion
This is the first work to show that it is feasible to quantify reliable parameters of free-living turns (number of turns, number of turning steps, duration of turn, turning angle) from wall-mounted cameras in a home-like setting in PD. We have shown that these turning parameters can differentiate between PD and control participants and that they show significant individual-level differences between the "ON" and "OFF" medication state in several PD participants. Additionally, there are strong correlations between the turning parameters and the MDS-UPDRS III scores in the PD cohort, particularly marked when "OFF" medication (the first time such a comparison has been done from in-home free-living data). An unexpected but potentially impactful finding is the presence and scale of change in turning behaviors when doing observed clinical assessments compared to "unobserved" (for these purposes, meaning no clinician physically present) free-living in this real-world setting.
In this study, we have demonstrated that people make >20 turns on average per person per hour while free-living. Over a longer period of monitoring in-home turning, the resulting dataset will potentially be rich and informative about naturalistic mobility outcomes. This is important as indoor turning outcomes can complement straight-ahead gait parameters [21,24], which are better measured outdoors (due to space constraints in most homes), to achieve a 24-h view of free-living mobility. 58% of the turns seen were taken over 90°, which is important since the majority of observed turns in clinic are over 180°; therefore, we could be missing valuable gait-related information in clinic.
Five participants with PD showed an increase in both the average number of turning steps and average duration of turn "OFF" medications, in comparison with their "ON" medication turns (one further PD participant showed this trend of increasing only number of turning steps "OFF" medications). Such changes show the promise of these turning parameters to detect dopamine-related gait fluctuations in PD. Two PD participants (PD 8 and PD 9) showed the opposite trends: their duration of turn (both participants) and number of turning steps (only PD 8) reduced "OFF" medications. There are several potential explanations for these unpredicted findings, including the fact that both participants noted significant issues with orthostatic hypotension in the "ON" state, which may have led to them turning slower "ON" medications. PD 8 also reported excessive daytime sleepiness (affecting mobility) in the morning, associated with sleep disturbance, and had slept worse according to their study diary in the nights before their "ON" medication data was collected than before their "OFF" medication period. The two participants with the greatest duration of PD took significantly more step turns while "OFF" medications compared to their "ON" medication state, reflecting how step turns appear later in the disease process; this difference was therefore 'unmasked' in these two people only when dopaminergic medications were withheld. The difference between individuals, and how medication-responsive their mobility outcomes were, highlights the compelling argument to stratify patients into their PD phenotypes in clinical trials to better appreciate the individual significance of mobility-related outcomes [33].
This work found, at a cohort level, that the PD participants take more turning steps and have longer durations of turn compared to the control participants (although a limitation is that the male/female ratios differ between the two groups). This supports previous works evaluating how turning differs between PD and control in the clinic [22]. However, to truly understand how free-living turning differs between PD and non-PD participants, larger cohorts are needed to reduce the impact of potential confounders (e.g. age and physical activity level). The parameters in this study are a subset of the potential turning outcomes which could be evaluated; they are particularly feasible for a human watching video to rate accurately, which is why they are reported in this study. Future work could look at other aspects of turning such as gait stability in turning.
It is well understood that (straight-ahead) gait parameters differ between the laboratory and home settings in PD [14,15]. However, this study adds information to this understanding in two ways: firstly, by showing marked differences in gait outcomes when all observations were made in a home-like "real-world" (not laboratory) setting, and secondly by looking specifically at changes in turning of gait as opposed to straight-ahead gait episodes. Differences seen between turning parameters in free-living compared to clinical assessments in this study may have occurred for a number of reasons: the Hawthorne effect of clinician and video recording; an increasing familiarity with tasks repeated several times; the single-task nature of the clinical assessment compared to potential dual-tasking in free-living (e.g. carrying a book and turning). Additionally, a physical setting can change how someone mobilizes [34]: the clinical assessments were undertaken in a relatively narrow hallway, whereas the free-living data was taken from any of the more open downstairs communal rooms or the hallway. Work has been done by other researchers to replicate naturalistic conditions in the laboratory in gait evaluation [35], but further work is needed to control for external influences on gait parameters in the design of clinical trials. For now, this work raises awareness that turn duration and number of turning steps measured during observed clinical assessments may not be representative of how someone turns at home.
The past few years have seen advances in how wearable sensors can automatically measure other aspects of gait and functioning, including freezing of gait [36] and falls risk [37]. Mobility quality parameters, such as turn angle, can also be automatically quantified by wearable sensors [38], but real-world validation remains challenging, and few studies have shown technical validity of digital outcomes (often limited to laboratory or home-like environments) [18,23,39].
Beyond turning parameters, other gait parameters including those related to straight-ahead gait pace (including step velocity and length), variability, rhythm and asymmetry show promise in free-living as potential biomarkers in PD [21].
Importantly, the knowledge that the control participant cohort also altered the way they turned during clinical assessments has broader implications for other research groups looking at mobility. How the Hawthorne effect and other aspects of the clinical assessment, including the setting, change turning outcomes should also be explored in other disease and non-disease groups.

Study strengths, limitations and future directions
This was a pilot study which had, as a particular strength, the use of multiple wall-mounted cameras collecting video recordings of unscripted and unobserved free-living behavior from people with PD in a home-like setting over multiple days. To the authors' knowledge, this is the first such work to do this. A further strength was in the amount of data captured, 85 h in total, meaning that the number of turns available for analysis was almost 4000. The turning annotations had almost perfect inter-rater agreement, resulting in a high-quality dataset which helps real-world PD symptom understanding.
We felt that wall-mounted cameras achieved a good visualization of each room, were minimally invasive to the study participants (although appreciating that in-home cameras may not be acceptable to all people with PD, the opinion of these participants was largely very positive [25]) and obviated the need for researchers to be present.
Although this pilot recruited participants with a wide variety of disease severities, the small sample size means that the results are not currently generalizable to the wider population of people with PD. A knowledge gap also remains around whether the same correlations would be seen between turning parameters and the clinical rating scale outcomes if only early-stage PD participants were studied. We hypothesize that larger sample sizes would be needed potentially over longer periods of time but that turning behaviors would correlate with disease severity outcomes.
As discussed above, the physical setting could be seen both as a study strength (a naturalistic setting as opposed to a laboratory environment) and a study limitation (relatively narrow hallway; unfamiliar setting for the participants which may have altered how they mobilized). Future work is needed in people's own homes with this scalable equipment, potentially capturing both free-living and structured assessments over longer periods of time, to get a true picture of how to characterize ecologically valid gait parameters for use as a marker of progression in disease-modifying therapeutic investigations.
The turning of gait parameters quantified were designed to be reproducible by clinicians watching similar videos in the future, including measures of quantity and quality of turning [20], but this approach to provide a ground truth with which to inform other sensors' data is burdensome in time and effort. Further work needs to be done to automate turning detection and quantification rather than using human-created annotations, even if these are a vital first component to achieving ecological validation of mobility parameters in the real world.

Conclusions
This work has shown that turning episodes occur frequently in free-living and can be captured and annotated manually using video data with excellent inter-rater agreement. It has demonstrated that turning parameters can differentiate the "ON" dopaminergic medication state from the "OFF" state in PD in some patients, and between PD and control participant cohorts. The turning parameters from 5 days of real-world data correlated with the gold-standard clinical outcome of disease severity in PD, demonstrating their potential for use as markers of disease progression. Importantly, even within this same home-like setting, many participants turned differently when undertaking clinical assessments compared to when they mobilized in free-living. Further work is needed to understand the impact of the study setting and Hawthorne effect on how someone with PD turns. In aggregate, we feel the above encourages future scalability work including, crucially, developing ecologically validated automatic approaches to detect turning of gait and quantify turning parameters in-home.

Data statement
Data is not available at this point due to confidentiality of participants, but we hope to make the data available in due course as openly as our participant consent allows through the University of Bristol Data Repository Service.

Ethical compliance statement
Full approval from NHS Wales Research Ethics Committee 6 was granted on 17th of December 2019, and Health Research Authority and Health and Care Research Wales approval confirmed on 14th of January 2020; the research was carried out in accordance with the Helsinki Declaration of 1975.
Written informed consent was gained from all study participants.